Getting Started with GitHub

Jon Bryant

Agenda

  • Why Git/GitHub?

  • Git Basics

  • Workflow

  • Next Steps

  • Resources

  • Q&A

Why Git/GitHub

We already have tools for tracking changes and collaboration for text-based documents. Why do we need version control solutions like GitHub?

Why Git/GitHub

  • Changes in one line of code can drastically impact other areas in non-obvious ways.

  • We want to have a historical record of changes to provide insight for others and our future selves.

  • We want to be able to prototype portions of our code without impacting what is already working in production.

  • We need more than one person to be able to work on a document at once.

Mostly we want to avoid

 

Git

  • Open source software for managing software projects

  • Very low level: used primarily through the CLI (i.e., Terminal)

  • Provides core functions (tracking changes, commits, pull/push, etc.)

GitHub

  • Web-based platform built on top of Git with an easier-to-use interface, collaboration tools, an issue tracking system, etc.

  • Widely used across the industry

  • A single source of truth

Git Basics

There are a few key concepts that are important for being successful with Git:

Repository

A centralized folder/directory where all of your code lives for a given project and is tracked by Git.

Commit

A snapshot of all the files in a repository

Pull

Pull changes from the server (remote) that are missing locally.

Push

Push changes (the commit) that exist locally but not on the server (remote)

Repo(sitory)

A repo is a folder/directory that is tracked by Git.

  • I would suggest creating the repo first in GitHub and then cloning it locally.

    • Cloning makes a local copy of the repo.
  • While you can go from an existing project to GitHub, it’s more work, so I generally don’t suggest it.

Repo - Create

I recommend creating private repos to start with, since public repos are accessible to…well, the public.

Repo - Clone

Click on the Code button to copy the repo’s URL
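
For anyone not using an IDE, the copied URL can be passed to `git clone` on the command line. The following is a minimal sketch that wraps that command in Python. The function name `clone_repo`, the `dry_run` flag, and the example URL are illustrative assumptions, not part of Git or GitHub.

```python
import subprocess


def clone_repo(url, dest=None, dry_run=False):
    """Build (and optionally run) the `git clone` command for a repo URL.

    Returns the command as a list of arguments.
    """
    command = ["git", "clone", url]
    if dest:
        # Optional target folder; Git picks a name from the URL if omitted.
        command.append(dest)
    if not dry_run:
        # Requires Git to be installed and the URL to be reachable.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical URL: replace with the one copied from the Code button.
    print(clone_repo("https://github.com/your-user/your-repo.git", dry_run=True))
```

With `dry_run=True` nothing is downloaded; the function only shows the command that would be run.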

Repo - Create using Version Control

For RStudio users, I recommend creating a new project using Version Control. We only have to do this once per repo (per machine).

Demo

Commit

After we make changes to files in a local repo that Git is tracking, we can commit those changes.

  • A commit is a snapshot of everything in the local repo that has changed since the last commit.
  • We want to make commits around logical units of change, as they should tell a story of how the project has developed.
  • There is no prescribed amount of work or frequency for commits. Commit when enough work has been done that it’s worth capturing that snapshot in time.

Commit - Changes

Commit - UI

Commit - Message

  • First line <50 characters, followed by a blank line, then additional details if necessary

  • If you’re writing a lot, then the commit is too big/infrequent, the message is too verbose, or some of the details should live elsewhere (comments, issues, etc).
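
These conventions can be checked mechanically. The following is a minimal sketch in Python, assuming the separator after the first line is a blank line (the common Git convention). The name `check_commit_message` and the constant `MAX_SUBJECT_LENGTH` are illustrative, not part of Git or GitHub.

```python
from typing import List

MAX_SUBJECT_LENGTH = 50  # assumed limit, taken from the "<50 characters" guideline


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["subject line is empty"]

    problems = []
    if len(lines[0]) >= MAX_SUBJECT_LENGTH:
        problems.append(
            f"subject line is {len(lines[0])} characters; "
            f"keep it under {MAX_SUBJECT_LENGTH}"
        )
    # Any details must be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank before the details")
    return problems


if __name__ == "__main__":
    message = (
        "Fix date parsing in import script\n"
        "\n"
        "Dates with two-digit years were read as 19xx."
    )
    print(check_commit_message(message))
```

An empty list means the message follows both guidelines; anything else is a hint to shorten the subject or add the blank line.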

Not all files should be tracked

Sometimes, there are files within a directory that we don’t want Git to track:

  • Large files

  • Sensitive files

  • Package/Library-related files

.gitignore

When we create a new project in RStudio using version control (in our case GitHub), it automatically creates a .gitignore file with a few key files that don’t need to be tracked by Git.

You can add files to .gitignore manually, or use your IDE: most IDEs have a UI for adding them.
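
Adding an entry manually just means appending a line to the .gitignore file. The following is a minimal sketch of that step in Python. The helper name `add_to_gitignore` and the example patterns are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def add_to_gitignore(repo_dir, pattern):
    """Append pattern to repo_dir/.gitignore unless it is already listed.

    Returns True if the file was changed, False if the pattern was present.
    """
    gitignore = Path(repo_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + prefix + pattern + "\n")
    return True


if __name__ == "__main__":
    # A temporary folder stands in for a real repo here.
    with tempfile.TemporaryDirectory() as repo:
        add_to_gitignore(repo, "secrets.env")  # hypothetical sensitive file
        add_to_gitignore(repo, "data/raw/")    # hypothetical folder of large files
        print((Path(repo) / ".gitignore").read_text())
```

Note that ignoring a file only affects files Git is not already tracking; a file that has already been committed stays tracked until it is removed from the repo's index.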

.gitignore example

These files won’t be tracked by Git nor will they show up in GitHub

Pull/Push

People usually say “Push/Pull”, but I’m being very specific in ordering it as Pull/Push. We always want to pull before we push.

As a reminder:

Pull

Pull changes from the server (remote) that are missing locally.

Push

Push changes (the commit) that exist locally but not on the server (remote)

Why Pull before Push?

What happens in the following scenario:

  1. Jon clones a repo to work on Feature A
  2. Farshad clones the repo to fix Bug B
  3. Farshad is a coding wizard and pushes the fix for Bug B the same day
  4. Two days later, Jon finishes Feature A and wants to push it to the remote.

Merge conflict

Git rejects my push because the remote now contains Farshad’s commit, which my local copy doesn’t have. If his fix for Bug B changed the same code as my new feature, pulling his changes will produce a merge conflict that I have to resolve.

  • This means I need to go back, pull the changes from the remote, make sure my code for Feature A still works properly, and then push it to the remote.

  • This doesn’t necessarily happen often, but it’s a best practice to avoid it by pulling before pushing.
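
The scenario can be sketched with a toy model in Python. This is not how Git is implemented: the `Remote` and `Clone` classes are hypothetical, history is just a list of commit names, and file contents (and therefore real merge conflicts) are ignored. It only illustrates why a push is refused until the missing commits have been pulled.

```python
class Remote:
    """Toy stand-in for the shared repository on the server."""

    def __init__(self, commits=None):
        self.commits = list(commits or [])  # ordered history, oldest first


class Clone:
    """Toy stand-in for one person's local copy of the repo."""

    def __init__(self, remote):
        self.remote = remote
        self.commits = list(remote.commits)

    def commit(self, name):
        self.commits.append(name)

    def pull(self):
        """Bring in remote commits we are missing, keeping our own on top."""
        missing = [c for c in self.remote.commits if c not in self.commits]
        own = [c for c in self.commits if c not in self.remote.commits]
        self.commits = list(self.remote.commits) + own
        return missing

    def push(self):
        """Succeed only if our history already contains everything on the remote."""
        n = len(self.remote.commits)
        if self.commits[:n] != self.remote.commits:
            return False  # rejected: the remote has commits we have not pulled
        self.remote.commits = list(self.commits)
        return True


if __name__ == "__main__":
    remote = Remote(["initial"])
    jon, farshad = Clone(remote), Clone(remote)

    farshad.commit("fix bug B")
    print("Farshad pushes:", farshad.push())

    jon.commit("feature A")
    print("Jon pushes without pulling:", jon.push())
    print("Jon pulls:", jon.pull())
    print("Jon pushes again:", jon.push())
```

In real Git, the pull step is also where any conflicting edits would surface and need to be resolved by hand before the second push.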

Pull/Push

 

Demo

What we covered today

  • Why Git/GitHub

  • Basics of Git

  • The complete workflow from start to finish

    • Create/Clone Repositories

    • Commits

    • Pull/Push

Next Time

Issues

GitHub issues are a great way of keeping track of new features, bugs, and other tasks associated with our projects.

  • Space to add more context to the code that is index-able for others and our future selves

  • Assign to individuals, tag, etc

  • Link issues to commits

Issues

In the process of creating this presentation, I created an issue for the Posit/Quarto team:

Branches

 

Branches

GitHub is all about branches. We haven’t discussed them yet, but so far we’ve been working only in the repo’s main branch. When working on a feature or bug, we can create a branch and then merge it back to main when we’re done.

Projects

Projects are based on Issues, which makes it easier to manage our work

Pages

Resources

Q&A